Variable Independence in Markov Decision Problems
Author
Abstract
In decision-theoretic planning, the problem of planning under uncertainty is formulated as a multidimensional, or factored, MDP. Traditional dynamic programming techniques are inefficient for solving factored MDPs whose state and action spaces are exponential in the number of state and action variables, respectively. We focus on exploiting problem structure imposed by variable independence, which implies decomposability of transition probabilities, rewards, and policies, and is captured by the interaction graph of an MDP, obtained from its influence diagram. Using the framework of bucket elimination [9], we formulate a variable elimination algorithm, elim-meu-id, for computing the maximum expected utility given an influence diagram, and apply it to MDPs. Traditional dynamic programming techniques for solving finite- and infinite-horizon MDPs, such as backward induction, value iteration, and policy iteration, can also be viewed as bucket elimination algorithms applied to a particular ordering of the state and decision variables. The time and space complexity of elimination algorithms is O(exp(w_o)), where w_o is the induced width of the interaction graph along the ordering o of its nodes. The unifying framework of bucket elimination makes the complexity analysis and variable ordering heuristics developed in constraint-based and probabilistic reasoning applicable to decision-theoretic planning. As we show, selecting "good" orderings improves the efficiency of traditional MDP algorithms.
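To make the complexity statement concrete, the following Python sketch (not code from the paper) computes the induced width w_o of an interaction graph along an elimination ordering and compares an arbitrary ordering with a greedy min-degree ordering heuristic; the star-shaped example graph, its node names, and both orderings are hypothetical.

```python
# A minimal sketch (not code from the paper) illustrating how the elimination
# ordering determines the induced width w_o of an interaction graph:
# eliminating a node connects all of its remaining neighbors, and the largest
# neighborhood encountered bounds the O(exp(w_o)) cost of bucket elimination.
from itertools import combinations


def induced_width(edges, ordering):
    """Induced width of the graph along `ordering` (nodes eliminated first to last)."""
    adj = {v: set() for v in ordering}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    width = 0
    for v in ordering:
        neighbors = adj[v]
        width = max(width, len(neighbors))
        for a, b in combinations(neighbors, 2):  # add fill-in edges among neighbors
            adj[a].add(b)
            adj[b].add(a)
        for u in neighbors:                      # disconnect the eliminated node
            adj[u].discard(v)
        del adj[v]
    return width


def min_degree_ordering(edges, nodes):
    """Greedy ordering heuristic: repeatedly eliminate a node of minimum degree."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    order = []
    while adj:
        v = min(adj, key=lambda x: len(adj[x]))
        for a, b in combinations(adj[v], 2):
            adj[a].add(b)
            adj[b].add(a)
        for u in adj[v]:
            adj[u].discard(v)
        order.append(v)
        del adj[v]
    return order


if __name__ == "__main__":
    # Hypothetical interaction graph: one action node "a" interacting with
    # every state variable x1..x4 (a star graph).
    nodes = ["a", "x1", "x2", "x3", "x4"]
    edges = [("a", "x1"), ("a", "x2"), ("a", "x3"), ("a", "x4")]
    bad = ["a", "x1", "x2", "x3", "x4"]       # eliminate the hub first
    good = min_degree_ordering(edges, nodes)  # eliminates the leaves first
    print("w_o along the bad ordering:", induced_width(edges, bad))   # 4
    print("w_o along min-degree:", induced_width(edges, good))        # 1
```

On this example, eliminating the action node first gives w_o = 4, while the min-degree ordering gives w_o = 1, so the exp(w_o) cost of elimination differs sharply between the two orderings.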
Similar resources
Robustness and Conditional Independence Ideals
We study notions of robustness of Markov kernels and probability distributions of a system that is described by n input random variables and one output random variable. Markov kernels can be expanded in a series of potentials that allow one to describe the system's behaviour after knockouts. Robustness imposes structural constraints on these potentials. Robustness of probability distributions is def...
Max-Planck-Institut für Mathematik in den Naturwissenschaften Leipzig: Robustness and Conditional Independence Ideals
We study notions of robustness of Markov kernels and probability distributions of a system that is described by n input random variables and one output random variable. Markov kernels can be expanded in a series of potentials that allow one to describe the system's behaviour after knockouts. Robustness imposes structural constraints on these potentials. Robustness of probability distributions is def...
A Heuristic Variable Grid Solution Method for POMDPs
Partially observable Markov decision processes (POMDPs) are an appealing tool for modeling planning problems under uncertainty. They incorporate stochastic action and sensor descriptions and easily capture goal-oriented and process-oriented tasks. Unfortunately, POMDPs are very difficult to solve. Exact methods cannot handle problems with much more than 10 states, so approximate methods must be...
Exploiting Agent and Type Independence in Collaborative Graphical Bayesian Games
Efficient collaborative decision making is an important challenge for multiagent systems. Finding optimal joint actions is especially challenging when each agent has only imperfect information about the state of its environment. Such problems can be modeled as collaborative Bayesian games in which each agent receives private information in the form of its type. However, representing and solving...
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques for solving large Markov decision processes (MDPs) are based on partitioning the state space into strongly connected components (SCCs) that can be classified into levels. In each level, smaller problems called restricted MDPs are solved, and these partial solutions are then combined to obtain the global solution. In this paper, we first propose a novel algorith...
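The decomposition idea described in this snippet can be illustrated with a few lines of Python (assuming the networkx library; the transition graph below is hypothetical, and this is not the algorithm proposed in the paper):

```python
# A minimal sketch of an SCC-based decomposition of an MDP's transition graph.
# States are grouped into strongly connected components, the condensation DAG
# is ordered topologically, and the restricted MDPs are listed in the order in
# which they could be solved (downstream components first).
import networkx as nx


def scc_levels(transitions):
    """Group states into SCC levels, downstream components first."""
    g = nx.DiGraph(transitions)        # edge s -> s' if s' is reachable in one step
    condensed = nx.condensation(g)     # DAG whose nodes are the SCCs of g
    order = list(nx.topological_sort(condensed))
    return [sorted(condensed.nodes[c]["members"]) for c in reversed(order)]


if __name__ == "__main__":
    # Hypothetical one-step transitions of a tiny MDP with states 0..5.
    transitions = [(0, 1), (1, 0), (1, 2), (2, 3), (3, 2), (3, 4), (4, 5), (5, 4)]
    for level, states in enumerate(scc_levels(transitions)):
        print(f"solve restricted MDP {level} over states {states}")
```

Each printed group corresponds to one restricted MDP; solving them from downstream to upstream lets each component reuse the values already computed for the components it can reach.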